ABSTRACT

In this work I study the problem of E/B-mode separation with binned cosmic shear two-point 
correlation function data. Motivated by previous work on E/B-mode separation with shear 
two-point correlation functions and the practical considerations of data analysis, I consider 
E/B-mode estimators which are linear combinations of the binned shear correlation function 
data points. I demonstrate that these estimators mix E- and B-modes generally. I then show 
how to define estimators which minimize this E/B-mode mixing and give practical recipes 
for their construction and use. Using these optimal estimators, I demonstrate that the vector 
space composed of the binned shear correlation function data points can be decomposed into 
approximately ambiguous, E- and B-mode subspaces. With simple Fisher information estimates,
I show that a non-trivial amount of information on typical cosmological parameters is
contained in the ambiguous mode subspace computed in this formalism. Next, I give two examples
which apply these practical estimators and recipes to generic problems in cosmic shear
data analysis: data compression and spatially locating B-mode contamination. In particular, 
by using wavelet-like estimators with the shear correlation functions directly, one can pinpoint 
B-mode contamination to specific angular scales and extract information on its shape. Finally, 
I discuss how these estimators can be used as part of blinded or closed-box cosmic shear data
analyses in order to assess and find B-mode contamination at high precision while avoiding
observer biases.
1 INTRODUCTION

Cosmic shear, or the weak gravitational lensing of background galaxies by cosmological
density fields, is one of the most important techniques for probing the properties of Dark
Energy (see, e.g., Weinberg et al. 2012 for a recent review) and also the growth of structure
predicted by General Relativity (GR) or its possible modifications (e.g., Schmidt 2008;
Beynon et al. 2010; Vanderveld et al. 2011). Cosmic shear measurements can also help constrain
other cosmologically interesting signals, such as primordial non-Gaussianity (e.g.,
Fedeli & Moscardini 2010; Marian et al. 2011; Maturi et al. 2011; Giannantonio et al. 2012;
Hilbert et al. 2012), the properties of various hot and warm dark matter models (e.g.,
Schaefer et al. 2008; Debono et al. 2010; Markovic et al. 2011), or the properties of
neutrinos (e.g., Cooray 1999; Song & Knox 2004; Hannestad et al. 2006; Kitching et al. 2008;
Ichiki et al. 2009; de Bernardis et al. 2009; Jimenez et al. 2010). To this end, ongoing and
planned wide-field optical surveys, such as the DES^1, LSST^2, Euclid^3, WFIRST^4, HSC^5,
KIDS^6, and Pan-STARRS^7 surveys, will measure the shapes of hundreds of millions to billions
of galaxies and thus cosmic shear signals with unprecedented statistical precision.

Given this incredible statistical power, understanding and mitigating potential systematic
errors in these measurements will be very important. Systematic contamination to cosmic shear
signals can arise from a variety of sources, including the process of observing and estimating
galaxy shapes from pixelated images (e.g., ... 2004; Guzik & Bernstein 2005;
Paulin-Henriksson et al. 2008; Cypriano et al. 2010; Voigt & Bridle 2010; Kacprzak et al. 2012;
Refregier et al. 2012; Voigt et al. 2012; Antonik et al. 2012) or es-




E-mail: beckermr@uchicago.edu 



1 The Dark Energy Survey - http://www.darkenergysurvey.org 

2 Large Synoptic Survey Telescope - http://www.lsst.org 

3 http://sci.esa.int/euclid 

4 Wide-Field Infrared Survey Telescope - http://wfirst.gsfc.nasa.gov 

5 Hyper Suprime-Cam - http://www.naoj.org/Projects/HSC 

6 The Kilo Degree Survey - http://kids.strw.leidenuniv.nl 

7 The Panoramic Survey Telescope & Rapid Response System - http://pan-starrs.ifa.hawaii.edu
... 2006; Bridle & King 2007; Sun et al. 2009; Hearin et al. 2010; Bernstein & Huterer 2010;
Cunha et al. 2012). There are also as- ... (e.g., Heavens et al. 2000; Croft & Metzler 2000;
Catelan et al. 2001; Crittenden et al. 2001, 2002; Jing 2002; Lee & Pen 2002;
Hirata & Seljak 2004; Heymans et al. 2006b; Hui & Zhang 2008; Semboloni et al. 2008), source
galaxy clustering (Schneider et al. 2002b), or the effects of baryons and galaxy formation on
the matter power spectrum (e.g., White 2004; Zhan & Knox 2004; Huterer & Takada 2005;
Jing et al. 2006; Rudd et al. 2008; Guillet et al. 2010; van Daalen et al. 2011;
Casarini et al. 2011; ... 2012; Hearin et al. 2012). These systematic errors, if left
uncontrolled, can bias and/or degrade constraints on the properties of Dark Energy or
modifications to GR from future surveys (e.g., Hirata & Seljak 2003, 2004;
Guzik & Bernstein 2005; Huterer & Takada 2005; Huterer et al. 2006; Mandelbaum et al. 2006a;
Ma et al. 2006; Bridle & King 2007; Hirata et al. 2007; Hearin & Zentner 2009; Sun et al. 2009;
Semboloni et al. 2009; Bernstein & Huterer 2010; Hearin et al. 2010; Kirk et al. 2012;
Laszlo et al. 2012; Cunha et al. 2012; Hearin et al. 2012).



Besides direct image and structure formation simulations to study cosmic shear data analysis,
systematics, and theoretical modeling in detail (e.g., STEP1 (Heymans et al. 2006a); STEP2
(Massey et al. 2007); GREAT08 (Bridle et al. 2010); GREAT10 (Kitching et al. 2012);
Jain et al. 2000; Vale & White 2003; Lee & Pen 2008; Hilbert et al. 2009; Sato et al. 2009;
Teyssier et al. 2009; Hahn et al. 2010; Kiessling et al. 2011; ... 2011;
Harnois-Deraps et al. 2012), it is important to distinguish between observational signals
which can arise from GR and those which cannot (Kaiser 1992). At first order in the
gravitational potential, GR will only produce cosmic shear patterns known as E-modes (see,
e.g., Dodelson 2003 for a pedagogical introduction). The complementary patterns, known as
B-modes, are not produced by GR at first order, though they can be produced in small amounts
at higher order (e.g., Jain et al. 2000; Cooray & Hu 2002; Vale & White 2003;
Hilbert et al. 2009; Bernardeau et al. 2010; Krause & Hirata 2010). Many of the sources of
systematic contamination, though not all, can produce B-modes in addition to E-modes (e.g.,
Crittenden et al. 2001, 2002; Schneider et al. 2002b; Vale et al. 2004; Hirata & Seljak 2004;
Jarvis & Jain 2004; Guzik & Bernstein 2005; Antonik et al. 2012). Therefore assessing B-mode
contamination in cosmic shear signals can test for systematic errors throughout the various
steps of cosmic shear data analysis, from observing the galaxies with a telescope and imaging
camera, all the way through to the theoretical modeling and constraints on cosmological
parameters.

Methods for separating E- and B-modes in cosmic shear data have been studied extensively by
previous authors. Broadly, these methods either operate directly on the shear field
(Schneider et al. 1998; Seljak 1998; Hu & White 2001; Heavens 2003; Leonard et al. 2012) or on
the shear two-point correlation functions (Crittenden et al. 2002; Schneider et al. 2002b;
Schneider & Kilbinger 2007; Schneider et al. 2010; Fu & Kilbinger 2010). E/B-mode separation
in the context of higher-order correlation functions has been studied as well
(Jarvis et al. 2004; Schneider et al. 2005; Shi et al. 2011; Krause et al. 2012).
Additionally, techniques originally designed for the analysis of Cosmic Microwave Background
polarization signals (e.g., Wandelt et al. 2001; Smith 2006) can also be applied to cosmic
shear (Hikage et al. 2011). Importantly, the details of the implementation of these methods
can affect their performance significantly (e.g., Smith 2006; Kilbinger et al. 2006). For the
shear two-point correlation functions, $\xi_\pm(\theta)$ defined below,
Schneider & Kilbinger (2007) have shown that a broad class of E/B-mode statistics can be
written in the following form

$$ E_c = \frac{1}{2} \int_L^H d\theta\, \theta \left[ T_+(\theta)\,\xi_+(\theta) + T_-(\theta)\,\xi_-(\theta) \right] $$

$$ B_c = \frac{1}{2} \int_L^H d\theta\, \theta \left[ T_+(\theta)\,\xi_+(\theta) - T_-(\theta)\,\xi_-(\theta) \right] $$

By choosing the range of integration $[L, H]$ and the forms of $T_\pm(\theta)$ properly, one
can show that $E_c$ will contain only E-mode information and $B_c$ will contain only B-mode
information either over an infinite interval or finite interval with $L > 0$
(Schneider & Kilbinger 2007). Note that these statistics assume one has continuous shear
correlation function data.

Practical implementations of the $E_c$ and $B_c$ statistics in cosmic shear data analysis are
constrained in several ways. The shear correlation functions are usually estimated in bins of
angle, say $N$ bins, with some effective binning weight $W_i(\theta)$. In particular, the
expectation value of the estimated shear correlation function data point for the $i$th bin is
(e.g., Schmidt et al. 2009)

$$ \langle \hat{\xi}_{\pm i} \rangle = \int_{L_i}^{H_i} d\theta\, W_i(\theta)\, \xi_\pm(\theta) \tag{1} $$

The form of these binning functions and this result is discussed below and in Appendix A.
Thus the statistics $E_c$ and $B_c$ in this case must be estimated from the binned shear
correlation function data. Also, in order for the statistics $E_c$ and $B_c$ to remain pure
two-point statistics, any procedure for estimating them from cosmic shear data can only
consider linear combinations of the binned shear correlation function data,

$$ X_\pm = \frac{1}{2} \sum_{i=1}^{N} \left[ F_{+i}\, \hat{\xi}_{+i} \pm F_{-i}\, \hat{\xi}_{-i} \right] \tag{2} $$

where the $F_{\pm i}$ are constants which describe the statistics (see e.g.,
Kilbinger & Munshi 2006).
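As a minimal sketch of these linear combinations, the function below forms $X_\pm$ from binned data and a set of filter coefficients. The function name `x_pm` and its argument names are illustrative assumptions, not part of the original analysis; the factor of 1/2 follows the convention $E = (I_+ + I_-)/2$, $B = (I_+ - I_-)/2$ used in Section 3.2.

```python
def x_pm(xi_plus, xi_minus, f_plus, f_minus):
    """Return (X_plus, X_minus) for binned correlation function data.

    X_pm = 0.5 * sum_i [F_{+i} * xi_{+i} +/- F_{-i} * xi_{-i}]
    All four inputs are sequences with one entry per angular bin.
    """
    if not (len(xi_plus) == len(xi_minus) == len(f_plus) == len(f_minus)):
        raise ValueError("all inputs must have one entry per angular bin")
    # Inner products of the data vectors with the two filters.
    i_plus = sum(f * x for f, x in zip(f_plus, xi_plus))
    i_minus = sum(f * x for f, x in zip(f_minus, xi_minus))
    return 0.5 * (i_plus + i_minus), 0.5 * (i_plus - i_minus)
```

Because the statistics are linear in the data, their covariance follows directly from the covariance of the binned correlation functions.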

In this work, I study the use of these linear combinations for E/B-mode separation and thus
the effects of these constraints on cosmic shear data analysis. In particular, after covering
the basics of cosmic shear in Section 2, I demonstrate in Section 3.1 that even if the
statistics $E_c$ and $B_c$ contain pure E- and B-mode information, the statistics $X_\pm$
generally exhibit E/B-mode mixing due to the binning. In Section 3.2 I show how to define the
statistics $X_\pm$ through the choice of the $F_{\pm i}$, so that they suppress the E/B-mode
mixing below detectable levels for any current or upcoming cosmic shear survey. In this
section I also give practical recipes for computing the $F_{\pm i}$. Other potential and
ultimately equivalent optimal estimator definitions are discussed in Section 3.3. I then show
how to use these optimal estimators to divide the vector space of correlation function data
points up into approximately ambiguous, E- and B-mode subspaces in Section 3.4. I compute the
Gaussian covariances of these statistics in Section 3.5. In Sections 4.1 and 4.2 I provide two
examples of these statistics to illustrate how to build and use them in practice. I provide an
example of how to decompose the shear correlation functions into ambiguous, E- and B-modes and
also discuss the Fisher information content of these subspaces in Section 4.3. I find that the
ambiguous mode subspace has a non-trivial amount of information about typical cosmological
parameters. Finally, I conclude and discuss how these statistics are applicable to blinded or
closed-box cosmic shear data analyses in Section 5.
2 COSMOLOGICAL WEAK LENSING

The basic equations describing weak lensing by cosmological density fields are covered in
detail in other works (see e.g., Bartelmann & Schneider 2001; Dodelson 2003;
Hoekstra & Jain 2008; Bartelmann 2010). The discussion presented here is merely a summary of
the relevant results needed for this work. Given the 3D matter density power spectrum
$P(k, a)$ as a function of wave number $k$ and scale factor $a$, the 2D convergence power
spectrum as a function of 2D wave number $\ell$ is defined in the Limber approximation as
(cf. Hoekstra & Jain 2008)

$$ C_\ell^{ij} = \int d\chi\, \frac{W_i(\chi)\, W_j(\chi)}{\chi^2}\, P(\ell/\chi(z), a) $$

$$ W_{i,j}(\chi) = \frac{3}{2}\, \Omega_m \left( \frac{H_0}{c} \right)^2 \frac{\chi}{a}\, \frac{1}{\bar{n}_{i,j}} \int d\chi_s\, n_{i,j}(\chi_s)\, \frac{\chi_s - \chi}{\chi_s} $$

where $z$ is the redshift, $\chi(z)$ is the comoving distance, and $n_i(\chi)$ is the redshift
distribution of the lensing sources for source set $i$ normalized to the total source density,
$\bar{n}_{i,j} = \int d\chi_s\, n_{i,j}(\chi_s)$. These expressions assume straight-line photon
paths from the sources to the observer, commonly called the Born approximation, and a
spatially flat universe. I use the non-linear power spectrum fitting formula of
Smith et al. (2003) to compute $C_\ell^{ij}$ and the fitting formula of Eisenstein & Hu (1998)
to evaluate the linear power spectrum. Additionally, in this work all lensing sources are at a
single redshift, $z_s = 1.0$, such that $n(\chi_s) = \delta(\chi_s - \chi(z_s))$, where
$\delta(x)$ is the Dirac delta function.

In cosmological weak lensing, one observes the shear field (neglecting reduced shear effects,
see e.g., Schneider & Seitz 1995; Mandelbaum et al. 2006b) along with a contribution from
galaxy shapes and orientations that is assumed to be random. This last effect is commonly
called shape noise and is characterized by the shape noise per component, $\sigma_e$. The
breaking of the assumption of random galaxy orientations is generically referred to as
intrinsic alignments and is a primary source of systematic error in cosmic shear measurements
(see the references given above). Given the complex shear field,
$\gamma = \gamma_1 + i\gamma_2$, one can define the $\xi_\pm$ correlation functions as (cf.
Schneider et al. 2002b)

$$ \xi_+ = \langle \gamma_t \gamma_t \rangle + \langle \gamma_\times \gamma_\times \rangle $$

$$ \xi_- = \langle \gamma_t \gamma_t \rangle - \langle \gamma_\times \gamma_\times \rangle $$

where $\gamma_t = -\mathrm{Re}(\gamma e^{-2i\phi})$,
$\gamma_\times = -\mathrm{Im}(\gamma e^{-2i\phi})$, and $\phi$ is the polar angle of the vector
connecting the two points. In Appendix A I present the standard expressions for galaxy
pair-wise shear correlation function estimators and their expectation values (see, e.g.,
Schneider et al. 2002a; Schmidt et al. 2009).

In Fourier space, the shear field is typically separated into a component with no net
handedness, the E-mode part, and a handed component, the B-mode part. In terms of the power
spectra of the E- and B-mode parts, these correlation functions are (cf.
Schneider & Kilbinger 2007)

$$ \xi_+(\theta) = \frac{1}{2\pi} \int_0^\infty d\ell\, \ell\, J_0(\ell\theta) \left[ P_E(\ell) + P_B(\ell) \right] \tag{3} $$

$$ \xi_-(\theta) = \frac{1}{2\pi} \int_0^\infty d\ell\, \ell\, J_4(\ell\theta) \left[ P_E(\ell) - P_B(\ell) \right] \tag{4} $$

where the $J_n(\ell\theta)$ are cylindrical Bessel functions. In the Born approximation, the
B-mode power is identically zero, $P_B(\ell) = 0$, and the E-mode power is equal to the
convergence power spectrum, $P_E(\ell) = C_\ell^{ij}$. Corrections to the Born approximation
are very small (e.g., Jain et al. 2000; Cooray & Hu 2002; Vale & White 2003;
Hilbert et al. 2009; Bernardeau et al. 2010; Krause & Hirata 2010). The shape noise
contribution to the power spectra, $\sigma_e^2/\bar{n}$, has been purposefully left out of
these expressions because pair-wise estimators of the shear correlation functions (see
Appendix A) do not exhibit noise biases (Schneider et al. 2002a).

Finally, the covariance of the shear-shear correlation functions can be computed under the
assumption that the shear fields are Gaussian with the following expressions from
Joachimi et al. (2008)

(£+/-(0i)£ +/ -(02)) = 

^fj Ml J o/4 W Jo/4 (19%) x [p%{£)+Pl{£) 



— / d££J (£6i)J4(£6 2 ) 

x \p 2 E {e) + Pl{l) + ^ [Pe{£) + Pb{£)] 

n 



(5) 



where $\Delta\theta_i$ is the bin width and $\Omega_s$ is the survey area. Note that the
shape noise contributes to cross terms in braces in addition to the diagonal terms given by
the Kronecker delta function. The covariance between $\xi_+$ and $\xi_-$ is given by the
second of the above expressions.

The expressions for the correlation function covariance matrix will be useful below for
computing the Fisher information content of the shear correlation functions under the
assumption the errors are Gaussian. The Fisher information matrix is (e.g.,
Tegmark et al. 1997)

$$ F_{ij} = \frac{1}{2} \mathrm{Tr}\left[ A_i A_j + C^{-1} M_{ij} \right] \tag{6} $$

where $C$ is the covariance matrix of the observations, $A_i = C^{-1} C_{,i}$, and
$M_{ij} = \mu_{,i}\, \mu_{,j}^T + \mu_{,j}\, \mu_{,i}^T$. Here $\mu$ is the vector of mean
values of the data and the notation $_{,i}$ indicates a partial derivative with respect to
parameter $\theta_i$. Below I neglect the information in the covariance matrix so that the
Fisher information is computed only using the last term in the equation above. There are
several conventions in the literature for comparing the Fisher information content of various
analyses. I roughly follow Schneider et al. (2010) and simply compare various analyses by
computing $f = \sqrt{|F|}$ for a fiducial set of parameters for each analysis. I use
$\sigma_8$, the normalization of the linear matter power spectrum today filtered in
$8\,h^{-1}$ Mpc spheres, and $\Omega_m$, the mean matter density today in units of the critical
density, as the fiducial set of parameters for comparison. These Fisher information estimates
are not meant to be realistic survey projections, but to give a sense of how much of the total
Fisher information is retained by the E/B-mode statistics presented in this paper.
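A minimal sketch of this comparison statistic for two parameters is given below, reading $|F|$ as the determinant of the Fisher matrix and keeping only the mean-derivative term, $F_{ij} = \mu_{,i}^T C^{-1} \mu_{,j}$. For brevity it assumes a diagonal data covariance, which is an illustrative simplification only (the Gaussian covariance of the correlation functions is not diagonal in general); the function name and arguments are hypothetical.

```python
import math

def fisher_figure_of_merit(dmu_dp1, dmu_dp2, variances):
    """Return f = sqrt(det F) for two parameters.

    dmu_dp1, dmu_dp2: derivatives of the mean data vector with respect to
    each parameter. variances: diagonal of the data covariance (assumed
    diagonal here purely for illustration). The parameter dependence of
    the covariance is neglected, so only the M_ij term is kept.
    """
    def weighted_dot(u, v):
        return sum(a * b / s for a, b, s in zip(u, v, variances))

    f11 = weighted_dot(dmu_dp1, dmu_dp1)
    f12 = weighted_dot(dmu_dp1, dmu_dp2)
    f22 = weighted_dot(dmu_dp2, dmu_dp2)
    det = f11 * f22 - f12 * f12
    # Guard against tiny negative values from rounding in degenerate cases.
    return math.sqrt(max(det, 0.0))
```

Two parameters whose derivative vectors are parallel give a vanishing determinant, reflecting a perfect degeneracy.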

Throughout this work I assume a flat $\Lambda$CDM universe with $\Omega_m = 0.25$,
$H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$, $h = 0.7$, $\sigma_8 = 0.8$, $n_s = 1.0$, and
$\Omega_b = 0.044$. Also, I consider two different prototypical weak lensing surveys with
different areas, $\Omega_s$, and lensing source number densities, $\bar{n}$. The first survey
has $(\Omega_s, \bar{n}) = (5000$ deg$^2$, $10$ gals/arcmin$^2)$, typical of the DES, and the
second has $(\Omega_s, \bar{n}) = (20{,}000$ deg$^2$, $40$ gals/arcmin$^2)$, typical of the
LSST survey. I set the lensing shape noise per component to $\sigma_e = 0.3$ for both the DES-
and LSST-like surveys.



© 2012 RAS, MNRAS 000. [71051 



4 M. R. Becker 



3 E/B-MODE ESTIMATION WITH BINNED DATA 

In this section I study the problem of E/B-mode estimation with 
binned cosmic shear data in detail. I show that generally estima- 
tors which are linear combinations of the binned cosmic shear data 
points mix E- and B-mode signals. I then demonstrate how to define 
optimal estimators which minimize this E/B-mode mixing and give 
practical recipes for constructing and using these estimators. Next 
I discuss other potential optimal estimator definitions, demonstrat- 
ing that they are approximately equivalent to the definition used 
throughout this work. Then I discuss how to decompose the vector 
space composed of the binned shear correlation functions into ap- 
proximately ambiguous, E- and B-mode subspaces. Finally, I com- 
pute the variance and covariance of these estimators under the as- 
sumption the shear power spectra are Gaussian. 



3.1 Binned Data & E/B-mode Mixing 

As stated above, shear correlation functions are typically measured in bins of angle so that
$\theta \in [L_i, H_i]$ for the $i$th bin. Additionally, for the standard shear two-point
function estimator, the shear correlation function measurements are weighted by the bin
weighting function $W_i(\theta)$, as in Equation (1). Generally these bin weighting functions
are normalized to unity so that $\int_{L_i}^{H_i} d\theta\, W_i(\theta) = 1$. Throughout this
work I assume the bin weighting functions are normalized properly. As detailed in Appendix A,
for pair-wise estimators of the shear correlation functions this bin weighting function is in
general quite complicated because of the survey window function and source galaxy clustering.
Note however that for unclustered sources and neglecting the survey window function,
$W_i(\theta) = 2\theta/(H_i^2 - L_i^2)$, so that the dominant effect is a purely geometric
weighting. This effect arises from the increase in the number of galaxy pairs due to the
increase in the area in the outer part of the bin relative to the inner part. I show in
Appendix A that by using small enough bins, the effects of the source clustering can be made
negligible. Assessing the form and magnitude of the weightings from the survey window function
is beyond the scope of this work. Thus when needed, I only use the geometric weightings given
above.

Given the bin window functions, I now consider statistics which are general linear
combinations of the shear correlation function data points described by Equation (2). The form
of these statistics has been chosen for easy comparison with work on E/B-mode statistics for
unbinned shear correlation function data presented by Schneider & Kilbinger (2007). It is
important to realize however that I have included a bin size dependent weight through the
normalization of the $W_i(\theta)$ in the definition of the statistics, unlike
Schneider & Kilbinger (2007). The correspondence between the $F_\pm$ coefficients, which define
the statistics $X_\pm$, and the $\{E_c, B_c\}$ statistics presented in
Schneider & Kilbinger (2007) is $T_\pm(\theta) = W_i(\theta) F_{\pm i}/\theta$ for each bin $i$.
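The geometric weighting can be checked numerically with a short sketch. The helper names below are illustrative assumptions, and the midpoint rule is simply one convenient quadrature choice.

```python
def geometric_weight(theta, lo, hi):
    """W_i(theta) = 2 theta / (H_i^2 - L_i^2) inside the bin [lo, hi], zero outside."""
    if not lo <= theta <= hi:
        return 0.0
    return 2.0 * theta / (hi**2 - lo**2)

def bin_average(func, lo, hi, n=1000):
    """Midpoint-rule estimate of the integral of W_i(theta) * func(theta) over the bin."""
    step = (hi - lo) / n
    total = 0.0
    for j in range(n):
        theta = lo + (j + 0.5) * step
        total += geometric_weight(theta, lo, hi) * func(theta)
    return total * step
```

With this weighting, `bin_average(lambda t: 1.0, lo, hi)` returns unity (the normalization), while the weighted mean of $\theta^2$ is $(H_i^2 + L_i^2)/2$, which shows how the pair counts favor the outer part of the bin.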

I will now show that the statistics $X_\pm$ always mix E- and B-mode information, barring a
very special choice of the bin window functions $W_i(\theta)$. This derivation follows closely
a derivation presented in Schneider & Kilbinger (2007), but accounts for the binning
explicitly. In order to proceed, I substitute the definitions of the shear correlation
functions in terms of the E- and B-mode power spectra from Equations (3) and (4) into the
definition of $X_\pm$ to get the following expression for the expectation value of the
statistics

$$ \langle X_\pm \rangle = \frac{1}{2\pi} \int_0^\infty d\ell\, \ell \left[ P_E(\ell)\, W_\pm(\ell) + P_B(\ell)\, W_\mp(\ell) \right] \tag{7} $$

$$ W_\pm(\ell) = \frac{1}{2} \sum_i \left[ F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta)\, J_0(\ell\theta) \pm F_{-i} \int_{L_i}^{H_i} d\theta\, W_i(\theta)\, J_4(\ell\theta) \right] \tag{8} $$

These equations with $P_B(\ell) = 0$ were presented by Kilbinger & Munshi (2006). The statistic
$X_+$ will have no B-mode contribution if $W_-(\ell) = 0$ for all $\ell$ and thus would be an
E-mode statistic. Similarly, $X_-$ would be a B-mode statistic. Setting $W_-(\ell) = 0$, I can
integrate over $\ell$ against $\ell J_4(\ell\phi)$ to get

$$ \sum_i F_{-i} \int_{L_i}^{H_i} d\theta\, W_i(\theta) \int_0^\infty d\ell\, \ell\, J_4(\ell\theta)\, J_4(\ell\phi) = \sum_i F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta) \int_0^\infty d\ell\, \ell\, J_0(\ell\theta)\, J_4(\ell\phi) $$

This equation must hold for arbitrary $\phi$. Now let $\phi \in [L_k, H_k]$. Using the closure
relationship for the Bessel functions

$$ \int_0^\infty d\ell\, \ell\, J_4(\ell\theta)\, J_4(\ell\phi) = \frac{1}{\phi}\, \delta(\phi - \theta) $$

and the following result from Schneider et al. (2002b)

$$ G(\theta, \phi) = \int_0^\infty d\ell\, \ell\, J_0(\ell\theta)\, J_4(\ell\phi) = \left[ \frac{4}{\phi^2} - \frac{12\,\theta^2}{\phi^4} \right] H(\phi - \theta) + \frac{1}{\phi}\, \delta(\phi - \theta) \tag{9} $$

I get the following expression for $F_{-k}$ in terms of the $F_{+i}$

$$ F_{-k} = F_{+k} + \frac{\phi}{W_k(\phi)} \sum_{i \le k} F_{+i} \int_{L_i}^{\min(H_i, \phi)} d\theta\, W_i(\theta) \left[ \frac{4}{\phi^2} - \frac{12\,\theta^2}{\phi^4} \right] \tag{10} $$

Here $H(\phi - \theta)$ is the Heaviside step function and $\delta(\phi - \theta)$ is the
Dirac delta function. This equation is the discrete analog of the results presented in
Schneider & Kilbinger (2007). In fact, it can be obtained by setting the quantities
$T_\pm(\phi)$ from Schneider & Kilbinger (2007) to $W_i(\phi) F_{\pm i}/\phi$ in each bin
$[L_i, H_i]$.

Originally I assumed that the $F_{\pm i}$ were constants, not functions of $\phi$, but the
equation derived above in general has dependence on $\phi$. Therefore generally $W_-(\ell)$
will not be identically zero for all $\ell$. The only way to ensure $W_-(\ell) = 0$ is to pick
the bin window functions $W_i(\phi)$ so that they exactly cancel the $\phi$ dependence in the
equation above and then use this equation to compute the $F_{-i}$ in terms of the $F_{+i}$.
Apart from this caveat, statistics of the form $X_\pm$ will mix E- and B-mode information in
general. This mixing arises directly from the binning and is a general feature of E/B-mode
separation with pixelized data as well (see e.g., Smith 2006; Lin et al. 2011).

These results have important implications for the theoretical analysis of binned cosmic shear
correlation function data. For binned shear correlation functions with $N$ bins, it would be
nice to divide the space of the $2N$ data points into pure E-mode, pure B-mode, and ambiguous
modes, similar to the decomposition achieved by Schneider et al. (2010) with the COSEBI
statistics over the continuous function space of the shear correlation functions. However, I
have shown that for general window functions $W_i(\theta)$, this decomposition is impossible
because there do not exist pure E- and B-mode linear combinations. In the next section, I will
define approximately pure E- and B-mode linear combinations, along with approximately
ambiguous modes. Thus I will show that there does in fact exist a division of the space of
$2N$ data points into approximately ambiguous, E- and B-mode subspaces. This approximate
decomposition is explored fully in Section 3.4.

Figure 1. The first nine components of the COSEBI-like basis for 50 shear correlation function
data points with $\theta \in [1, 400]$ arcmin. The thick black lines show the $F_+$ filters and
the thin blue lines show the $F_-$ filters. Each set of $F_\pm$ filters has been normalized to
the same scale in each panel. (Horizontal axis: $\theta$ [arcmin].)

Figure 2. The signal-to-noise of the COSEBI-like E- and B-mode statistics as a function of the
number of shear correlation function bins. The solid lines show from top to bottom the
signal-to-noise of the first 12 E-mode statistics (estimated using only the diagonal elements
of the covariance matrix). The dashed lines show the signal-to-noise in the B-mode statistics.
The dotted line in each panel marks a signal-to-noise of unity. The left panel is for a
DES-like survey, while the right panel is for an LSST-like survey. The intrinsic B-mode power
was set to zero for this computation, so any statistically significant B-mode statistics are
due purely to E/B-mode mixing. This mixing decreases as the number of shear correlation
function bins increases. Additionally, the fact that only ~8-10 of the E-mode statistics are
statistically significant illustrates the data compression properties of these statistics.

These results are also quite useful for cosmic shear data analysis. Suppose one did in fact
use a statistic which is a linear combination of the shear correlation function data points.
Then from Equations (7) and (8) one can compute how the statistic mixes E- and B-modes due to
the binning. Additionally, as I will show below, as the shear correlation function bins are
made smaller, the magnitude of $W_-(\ell)$ will decrease, so that the E/B-mode mixing
decreases as well. If the bins are made small enough, the bias in $X_-$, the B-mode statistic,
due to E/B-mode mixing can be made smaller than the statistical errors. Thus these results
give a quantitative criterion by which to decide the number and size of the bins used to
compute the shear correlation functions. Finally, Equation (10) suggests two ways to minimize
the E/B-mode mixing. The first is to reweight the data in each bin by choosing the
$W_i(\phi)$ to cancel the $\phi$ dependence in Equation (10) and then use this equation to
compute the $F_{-i}$. I will not consider this possibility here. The second is to pick the
$F_{-i}$ in order to minimize the E/B-mode mixing without adjusting the bin window functions.
Given a fiducial choice for the $F_{+i}$, one can define $F_{-i}$ in some way (roughly similar
to Equation (10)) in order to minimize the mode mixing by minimizing $W_-(\ell)$. The details
of this definition along with a general procedure for computing the $F_{\pm i}$ are described
in the next section.
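The window functions of Equation (8) can be evaluated numerically to see the mixing for a given set of filters. The sketch below is an illustration under stated assumptions: it uses the geometric bin weighting, evaluates the Bessel functions from Bessel's integral representation with a simple midpoint rule (adequate only for moderate arguments), and expects the angles in radians so that the product of multipole and angle is dimensionless. The function names are hypothetical.

```python
import math

def bessel_j(n, x, steps=200):
    """Integer-order Bessel function J_n(x) from Bessel's integral,
    J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt, via the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def window_pm(ell, edges, f_plus, f_minus, n=50):
    """Return (W_plus, W_minus) at multipole ell for geometric bin weights.

    edges: increasing bin edges in radians; f_plus, f_minus: one
    coefficient per bin; n: midpoint-rule points per bin.
    """
    w_plus = 0.0
    w_minus = 0.0
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        step = (hi - lo) / n
        a0 = 0.0  # integral of W_i(theta) * J_0(ell * theta)
        a4 = 0.0  # integral of W_i(theta) * J_4(ell * theta)
        for j in range(n):
            theta = lo + (j + 0.5) * step
            w = 2.0 * theta / (hi**2 - lo**2) * step
            a0 += w * bessel_j(0, ell * theta)
            a4 += w * bessel_j(4, ell * theta)
        w_plus += 0.5 * (f_plus[i] * a0 + f_minus[i] * a4)
        w_minus += 0.5 * (f_plus[i] * a0 - f_minus[i] * a4)
    return w_plus, w_minus
```

Scanning `window_pm` over a range of multipoles for a candidate filter pair gives a direct view of how large $W_-(\ell)$ is relative to $W_+(\ell)$, and therefore of how much E-mode power leaks into the B-mode statistic.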



3.2 Building Binned E/B-mode Estimators 

Consider now binned shear correlation function data over the range $[L, H]$ in angle with $N$
bins. Each bin is described by the bin window functions introduced above so that the bins
cover the range $[L, H]$ without any overlaps between the ranges of each bin, $[L_i, H_i]$. In
order to define the $F_{-k}$, I minimize the square amplitude of the window function
$W_-(\ell)$ with respect to the coefficients $F_{-k}$,

$$ 0 = \frac{\partial}{\partial F_{-k}} \int_0^\infty d\ell\, \ell\, |W_-(\ell)|^2 \tag{11} $$

The solution to this equation is

$$ F_{-k} = F_{+k} + \left[ \int_{L_k}^{H_k} d\theta\, \frac{W_k^2(\theta)}{\theta} \right]^{-1} \sum_i F_{+i} \int_{L_i}^{H_i} \int_{L_k}^{H_k} d\theta\, d\phi\, W_i(\theta)\, W_k(\phi) \left[ \frac{4}{\phi^2} - \frac{12\,\theta^2}{\phi^4} \right] H(\phi - \theta) \tag{12} $$

This solution is equivalent to multiplying Equation (10) by $W_k(\phi)/\phi$ and then
integrating over $\phi$ with the weight $W_k(\phi)$.
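The matrix implied by Equation (12) can be tabulated by direct quadrature. The sketch below assumes the geometric bin weighting and contiguous bins, and uses a midpoint rule on a regular grid; it is an illustration of the structure of the calculation rather than the closed-form expressions, and the function name is hypothetical.

```python
def m_plus(edges, n=100):
    """Matrix M_+ of Equation (12) for geometric weights W_i = 2*theta/(H_i^2 - L_i^2).

    edges: increasing, contiguous bin edges; n: midpoint-rule points per bin.
    Row k, column i gives the contribution of F_{+i} to F_{-k}.
    """
    nbins = len(edges) - 1

    def weight(x, lo, hi):
        return 2.0 * x / (hi**2 - lo**2)

    matrix = [[0.0] * nbins for _ in range(nbins)]
    for k in range(nbins):
        lk, hk = edges[k], edges[k + 1]
        dphi = (hk - lk) / n
        phis = [lk + (b + 0.5) * dphi for b in range(n)]
        # Normalization: integral of W_k(theta)^2 / theta over bin k.
        norm = sum(weight(p, lk, hk) ** 2 / p for p in phis) * dphi
        for i in range(k + 1):  # bins above bin k never satisfy theta < phi
            li, hi = edges[i], edges[i + 1]
            dth = (hi - li) / n
            total = 0.0
            for b, phi in enumerate(phis):
                for a in range(n):
                    theta = li + (a + 0.5) * dth
                    if theta < phi:
                        step = 1.0  # Heaviside H(phi - theta) = 1
                    elif i == k and a == b:
                        step = 0.5  # grid cell cut in half by the line theta = phi
                    else:
                        continue
                    total += (step * weight(theta, li, hi) * weight(phi, lk, hk)
                              * (4.0 / phi**2 - 12.0 * theta**2 / phi**4))
            matrix[k][i] = (1.0 if i == k else 0.0) + total * dth * dphi / norm
    return matrix
```

The resulting matrix is lower triangular, because the step function removes any contribution from bins at larger angles than bin $k$.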

Similarly to the unbinned estimators described in Schneider & Kilbinger (2007), the binned
estimators described in this work must satisfy two integral constraints in order to be
non-zero only over a finite range in angle. These constraints can be understood as follows.
Suppose that the $F_{+i}$ are non-zero only in the interval $[L_s, H_s]$ contained in
$[L, H]$. Now consider $\phi \ge H_s$. Then according to Equation (12), the $F_{-i}$ will be
non-zero only in the interval $[L_s, H_s]$ if the $F_{+i}$ satisfy the following constraints

$$ 0 = \sum_{\theta_i \in [L_s, H_s]} F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta) \tag{13} $$

$$ 0 = \sum_{\theta_i \in [L_s, H_s]} F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta)\, \theta^2 \tag{14} $$

where the sums run only over bins in the interval $[L_s, H_s]$. Given data in some fiducial
angular interval, one can always increase the angular range considered as long as the
$F_{\pm i}$ are zero outside the fiducial angular range. Thus in order for the $X_\pm$
statistics to self-consistently consider only data in a finite angular range and minimize
E/B-mode mixing by minimizing $W_-(\ell)$, they must satisfy the constraints given above.
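The origin of the two constraints can be made explicit with a short worked step, which follows from Equation (12) under the stated assumption that the $F_{+i}$ vanish outside $[L_s, H_s]$. For any bin $k$ lying entirely above $H_s$, every $\theta$ in the support satisfies $\theta < \phi$, so the step function equals one, $F_{+k} = 0$, and the double integral separates:

```latex
F_{-k} \;\propto\;
  \left[ \int_{L_k}^{H_k} d\phi\, \frac{4\, W_k(\phi)}{\phi^2} \right]
  \sum_i F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta)
  \;-\;
  \left[ \int_{L_k}^{H_k} d\phi\, \frac{12\, W_k(\phi)}{\phi^4} \right]
  \sum_i F_{+i} \int_{L_i}^{H_i} d\theta\, W_i(\theta)\, \theta^2 .
```

The two bracketed $\phi$ integrals are not proportional to one another as the bin $k$ varies, so $F_{-k}$ vanishes for every such bin only if both sums vanish separately.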

As discussed in Schneider et al. (2010), these constraints serve to project out ambiguous
modes in the continuous correlation functions $\xi_\pm$ which cannot be uniquely classified as
either E- or B-modes. These modes are $\xi_+(\theta) = a + b\theta^2$ and so these constraints
simply project out the binned versions of these modes. Barring the caveat discussed above, in
the discrete case no modes can be uniquely classified as pure E- or B-modes. Thus in this
sense all modes are ambiguous in the discrete case, due to the binning. However, E/B-mode
separation for the discrete modes along $\xi_+(\theta) = a + b\theta^2$ is not possible even
in the limit of infinitely small bins, so these discrete modes are the analogues of the
ambiguous continuous modes. Importantly, these ambiguous modes can still potentially carry
cosmological information; one simply cannot uniquely determine whether the modes are sourced
by the E- or B-mode power spectrum. Assuming one has determined that the cosmic shear data is
free of systematic contamination, it may be advantageous to keep and use the information in
the ambiguous modes (see Smith 2006 for a similar point in the context of the
pseudo-$C_\ell$ formalism and also Schneider et al. 2010). This point is illustrated below
with a simple Fisher information analysis.

Given the linear relation between the $F_{\pm i}$ and the linear constraints on the
$F_{+i}$, it is natural to treat them as vectors of length $N$. Thus the two integral
constraints in Equations (13) and (14) mean that the $F_+$ vector cannot have any component
along the directions

$$ F_{+a} = \left( \int_{L_1}^{H_1} d\theta\, W_1(\theta),\ \int_{L_2}^{H_2} d\theta\, W_2(\theta),\ \ldots,\ \int_{L_N}^{H_N} d\theta\, W_N(\theta) \right) $$

$$ F_{+b} = \left( \int_{L_1}^{H_1} d\theta\, W_1(\theta)\, \theta^2,\ \int_{L_2}^{H_2} d\theta\, W_2(\theta)\, \theta^2,\ \ldots,\ \int_{L_N}^{H_N} d\theta\, W_N(\theta)\, \theta^2 \right) $$

Similarly, Equation (12) defines a matrix, which I will denote as $M_+$, which relates the
$F_+$ to the $F_-$ via $F_- = M_+ F_+$. I give explicit forms for these quantities assuming
geometric bin weightings in Appendix B.

The overall process to create a set of F+ and F- vectors and 
measure the E- and B-mode signals in this scheme is quite straight 
forward. 

(1) Choose an initial shape for the F+ vector. Usually one has
some external motivation for this choice, for example statistics
which form a complete set of functions and exhibit efficient data
compression (Schneider et al. 2010), statistics that are compact
and non-oscillatory in Fourier space (Leonard et al. 2012), or
statistics which maximize the signal-to-noise in the measurement
of cosmological parameters like σ8 or Ωm (Fu & Kilbinger 2010).

(2) Make F+ orthogonal to the F+a and F+b vectors defined
above. In practice, an efficient way to do this is to first make F+a
and F+b orthogonal to one another by modifying F+b. Then it is
easy to make F+ orthogonal to both F+a and the modified F+b by
projecting out the components of F+ along the modified constraint
directions.

(3) Use the matrix M+ defined in Equation (12) to compute the
corresponding F- filter via F- = M+F+.

(4) Compute the E- and B-mode signals by forming the vector
inner product of the binned ξ± shear correlation function data with
the F± filters. Then E = X+ = (I+ + I-)/2 and B = X- =
(I+ - I-)/2, where I± = Σ_i ξ±i F±i.
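
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the implementation used for the results in this work. It assumes an area-weighted bin weighting W_i(θ) = 2θ/(H_i² − L_i²), for which the two constraint integrals have closed forms (the explicit forms actually used are those of Appendix B). The function names are hypothetical, and the identity matrix is only a placeholder for the matrix M+ of Equation (12), which is not reproduced here.

```python
import numpy as np

def constraint_vectors(lo, hi):
    """Constraint directions F+a and F+b for bins [lo_i, hi_i].

    ASSUMPTION: W_i(theta) = 2 theta / (hi_i^2 - lo_i^2) (area weighting).
    """
    f_a = np.ones_like(lo)          # integral of W_i over bin i
    f_b = 0.5 * (hi**2 + lo**2)     # integral of W_i * theta^2 over bin i
    return f_a, f_b

def project_out(f_plus, f_a, f_b):
    """Step (2): remove the components of f_plus along f_a and f_b."""
    # First make f_b orthogonal to f_a, then project out both directions.
    f_b_mod = f_b - (f_b @ f_a) / (f_a @ f_a) * f_a
    out = f_plus - (f_plus @ f_a) / (f_a @ f_a) * f_a
    return out - (out @ f_b_mod) / (f_b_mod @ f_b_mod) * f_b_mod

def eb_statistics(xi_plus, xi_minus, f_plus, m_plus):
    """Steps (3)-(4): F- = M+ F+, then E = (I+ + I-)/2 and B = (I+ - I-)/2."""
    f_minus = m_plus @ f_plus
    i_plus = xi_plus @ f_plus
    i_minus = xi_minus @ f_minus
    return 0.5 * (i_plus + i_minus), 0.5 * (i_plus - i_minus)

# Example: 20 logarithmic bins between 1 and 400 arcmin.
edges = np.geomspace(1.0, 400.0, 21)
lo, hi = edges[:-1], edges[1:]
f_a, f_b = constraint_vectors(lo, hi)
initial_shape = np.sin(np.log(np.sqrt(lo * hi)))   # step (1): arbitrary shape
f_plus = project_out(initial_shape, f_a, f_b)
m_plus = np.eye(lo.size)                           # PLACEHOLDER for M+
E, B = eb_statistics(np.ones(20), np.ones(20), f_plus, m_plus)
```

With a real M+ and measured ξ± vectors, only the last two lines change; the projection step is independent of the data.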

Below I use both the procedure just described and also Gram-
Schmidt orthogonalization to define F+ functions. In the case of
Gram-Schmidt orthogonalization, one simply uses F+a and F+b
first in the Gram-Schmidt process, followed by the rest of the initial
estimator shapes for F+. The result of applying the Gram-Schmidt
procedure is a set of orthogonal F+ filters contained in the last
N − 2 elements of the orthogonal basis. The first two elements of
this basis span the space generated by the constraint directions
F+a and F+b. The F- filters are again obtained from the matrix
defined in Equation (12).
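
A minimal sketch of the Gram-Schmidt variant follows, assuming the constraint vectors and the initial shapes are already available as NumPy arrays. The function name is hypothetical, and the filters are normalized to unit length here, which is a choice of this sketch and not something the text requires.

```python
import numpy as np

def gram_schmidt_filters(f_a, f_b, shapes):
    """Orthonormalize [f_a, f_b, *shapes] and return the filters after the first two.

    shapes: array of shape (K, N) holding K initial estimator shapes.
    """
    basis = []
    for v in [f_a, f_b, *shapes]:
        w = np.array(v, dtype=float)
        for q in basis:                      # modified Gram-Schmidt sweep
            w = w - (w @ q) * q
        norm = np.linalg.norm(w)
        if norm < 1e-12 * np.linalg.norm(v):
            raise ValueError("input vectors are linearly dependent")
        basis.append(w / norm)
    return np.array(basis[2:])               # drop the two constraint directions

# Toy example with N = 8 bins and random initial shapes.
rng = np.random.default_rng(0)
n = 8
f_a = np.ones(n)
f_b = np.linspace(1.0, 2.0, n) ** 2
filters = gram_schmidt_filters(f_a, f_b, rng.normal(size=(n - 2, n)))
```

Because the constraint directions are processed first, every returned filter is orthogonal to both of them by construction.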



3.3 Other Potential Optimal Estimator Definitions 

In this section I will explore two other potential optimal estimator
definitions. The first definition simply consists of swapping the
roles of F+ and F- in the previous section. Thus in this scheme the
optimal estimator is now defined by







$$
\frac{\partial}{\partial F_{+k}} \int d\ell\,\ell\,|W_-(\ell)|^2 = 0 . \tag{15}
$$



The full estimators for this case are presented in Appendix C. In this
case, all of the same results presented above hold and in particular
there exists a matrix M- which defines the F+ in terms of the F-
via F+ = M-F-. There also exist ambiguous modes, F-a and
F-b, in analogy to the ones defined above. Finally, the procedure
just given for E/B-mode separation can be carried out in exactly the
same way as described above by first defining fiducial F- filters,
projecting out the F-a and F-b modes, computing the F+ filters
from the matrix M-, and then computing X±.

The second potential optimal estimator definition consists of
minimizing the power in the W-(ℓ) window function by varying
both F+ and F- simultaneously. The two equations that must be
solved for this kind of estimator are

$$
\frac{\partial}{\partial F_{-k}} \int d\ell\,\ell\,|W_-(\ell)|^2 = 0 ,
\qquad
\frac{\partial}{\partial F_{+k}} \int d\ell\,\ell\,|W_-(\ell)|^2 = 0 .
$$

Using the results for the two optimal estimators already presented,
the two equations the F± vectors must satisfy are

$$
F_- = M_+ F_+ , \qquad F_+ = M_- F_- .
$$

Combining these two relations, one finds that the optimal F+ and F-
vectors computed in this way must satisfy

$$
0 = (M_- M_+ - I)\,F_+ , \qquad 0 = (M_+ M_- - I)\,F_- ,
$$

where I is the identity matrix. Thus in this case the F± vectors must
be in the null spaces, or kernels, of the matrices M∓M± − I.

A straightforward way to investigate the kernels of these matrices
is to compute their Singular Value Decomposition (SVD)
(e.g., Press et al. 1992). The SVD of a matrix M is defined as
M = UΣV^T. Here U and V are orthogonal matrices, assuming
M is real, and Σ is a diagonal matrix with the singular values along
the diagonal. One can show that a basis for the kernel of a square
matrix can be found by using all of the columns of V which have
zero singular values in Σ. Similar results hold for non-square
matrices but they are of no use here. Additionally, the rank of the
matrix is the number of non-zero singular values, so that a square
matrix with all non-zero singular values has full rank and is invertible.
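
As an illustration of this use of the SVD, the sketch below extracts a kernel basis from `numpy.linalg.svd`. The relative tolerance used to decide which singular values count as zero is an assumption of this example and should be chosen with the rounding errors of the actual matrices in mind; the function name is hypothetical.

```python
import numpy as np

def null_space_basis(mat, rtol=1e-10):
    """Return (kernel basis as columns, singular values, numerical rank).

    Singular values below rtol * (largest singular value) are treated as zero.
    """
    _, s, vt = np.linalg.svd(mat)
    rank = int(np.sum(s > rtol * s.max()))
    # Rows of vt beyond the rank are the right-singular vectors with
    # (numerically) zero singular value; they span the kernel.
    return vt[rank:].T, s, rank

# Toy matrix of rank 2 whose kernel is spanned by (1, 1, 1).
mat = np.array([[1.0, -1.0, 0.0],
                [0.0, 1.0, -1.0],
                [-1.0, 0.0, 1.0]])
kernel, s, rank = null_space_basis(mat)
```

An empty kernel (zero columns returned) corresponds to the full-rank, invertible case described above.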

Using logarithmic binning and the geometric bin weighting
Wi(θ), I find that the matrix M-M+ − I has two large singular
values which exceed all of the others by several orders of magnitude.
The N − 2 other singular values are of similar magnitude to each
other, but much smaller than the largest two. The difference
between the two largest and the other N − 2 singular values increases
as the number of bins is increased in a fixed angular range. Also,
the columns of V along the two largest singular values are
approximately along the ambiguous directions F+a and F+b. Analogous
properties hold for the singular values of the matrix M+M- − I
and the ambiguous directions F-a and F-b. Unfortunately, all of
the singular values of these matrices are non-zero so that no vector
except the zero vector lies in their null spaces. Note that one must
be careful to account for rounding errors when considering whether
or not singular values are zero. I find however that the N − 2 smaller
singular values are well above rounding errors. Thus optimal
estimators defined by varying both the F+ and F- coefficients
simultaneously do not exist in this case.

Figure 3. The first nine components of the wavelet-like estimators for 100 correlation function data points and θ ∈ [1, 400] arcmin. The thick black lines
show the F+ filters and the blue lines show the F- filters. The index {N, i} of each filter is given in the lower left corner of each panel.

However, this analysis does provide useful insight into the
relationship between the estimators defined by varying F- and those
defined by varying the F+ coefficients. In particular, given that the
singular values along the directions orthogonal to the ambiguous
directions {F+a, F+b} and {F-a, F-b} are very small, we have
that F+ and F- vectors orthogonal to these directions span
subspaces which are close to being kernels of the matrices M∓M± − I.
Thus

$$
(M_- M_+ - I)\,F_+ \approx 0 , \qquad (M_+ M_- - I)\,F_- \approx 0 .
$$

One can verify this property empirically as well. Therefore the
matrix M+ generates F- vectors which lie approximately in the
null space of M+M- − I and vice versa. (To see this,
simply multiply the two equations above by M+ and M-
respectively and use the relationship F∓ = M±F±.) Thus F- vectors
generated from F+ vectors are very close to being orthogonal to the
constraint directions {F-a, F-b}. The analogous property holds
for F+ vectors generated from F- vectors and the constraint
directions {F+a, F+b}. Additionally, due to these approximate
equalities, the operation of "composing" the two different estimator
definitions approximately returns the identity. In other words, one can
first compute optimal estimators by varying the F- and supplying
fiducial guesses for the F+. Then the computed F- vectors can be
used to supply fiducial guesses for the set of optimal estimators
defined by varying the F+. The new set of estimators that result from
varying the F+ will be very close to the first set of estimators
defined by varying the F-. This fact can be verified empirically. In
this sense, the two optimal estimator definitions are consistent.
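
The parenthetical step in the argument above can be written out explicitly. Multiplying the approximate relation for F+ from the left by M+ and using F- = M+F+ gives

$$
M_+ (M_- M_+ - I)\,F_+ = (M_+ M_- - I)\,M_+ F_+ = (M_+ M_- - I)\,F_- \approx 0 ,
$$

and in the same way, multiplying the relation for F- by M- and using F+ = M-F- gives

$$
M_- (M_+ M_- - I)\,F_- = (M_- M_+ - I)\,M_- F_- = (M_- M_+ - I)\,F_+ \approx 0 .
$$

This restates the reasoning in the text and adds no new result. It does implicitly assume that M± do not strongly amplify the small residual vectors on which they act.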



3.4 Approximate E-, B-, and Ambiguous Mode Decompositions for Binned Cosmic Shear Data

It is now straightforward to build the approximate decomposition
of the space of 2N data points into E-mode, B-mode, and ambigu- 
ous modes discussed above. To do this properly, I must specify a 
basis for the vector space of 2N data points which can be divided 
into directions approximately along ambiguous, E- and B-modes. 
The subspaces spanned by each of these three sets of modes need 
not be mutually orthogonal, though in practice the E- and B-mode 
subspaces are approximately orthogonal to the ambiguous mode 
subspace defined below. 

First, I specify the basis vectors for the ambiguous directions.
These vectors are

$$
\{F_{+a}, 0\}, \quad \{F_{+b}, 0\}, \quad \{0, F_{-a}\}, \quad \{0, F_{-b}\}, \tag{16}
$$

where the 0 are zero vectors of length N. Next I specify the E- and
B-mode directions. This is done by building a set of N − 2 E- and
B-mode estimators from a set of N − 2 F+ vectors. Additionally,
these N − 2 F+ vectors together with F+a and F+b should form a
basis for the ξ+ subspace of the total vector space composed of both
ξ+ and ξ-. The construction of the optimal estimators enforces that
the N − 2 F+ vectors are exactly orthogonal to the F+a and F+b
vectors. Additionally, as shown above, the resulting F- vectors will
be approximately orthogonal to the F-a and F-b vectors. So using
the N − 2 F+ vectors along with their F- counterparts, one can
define the final 2N − 4 potential basis vectors of this space as

$$
\{F_+, +F_-\}/2 , \qquad \{F_+, -F_-\}/2 , \tag{17}
$$



where this construction is repeated for each of the N − 2 F+
vectors. The first of these vectors is simply the direction of X+ and
the second is in the direction of X- so that they are approximate
E- and B-mode directions. Finally, one must verify that this set of
vectors combined with the ambiguous mode directions is in fact a
basis for the vector space of dimension 2N by, for example,
determining the rank of the matrix composed of these vectors as rows.
Note that this construction would work just as well by first
building a basis with the F- vectors and the {F-a, F-b} vectors, and
then computing the F+ vectors. Below, I will investigate this
construction for the COSEBI-like (Schneider et al. 2010) basis vectors
defined in Section 4.1.

Figure 4. Example B-mode detection in noise-free data with 200 correlation function points from 1 to 400 arcmin. Each row on the left shows the underlying
true correlation functions in black and the correlation functions with systematics in blue. On the right each row shows the absolute value of the wavelet-like
scale-location B-mode statistics defined in Section 4.2. The vertical axis label denotes how many sections the interval [1, 400] was divided into in log-space
with the intervals increasing in size from bottom to top. The interval width is reflected in the size of each colored region. The color of each region encodes the
amplitude of the B-mode level. The top row has a smooth systematic signal while the bottom row has many more small-scale variations. These finer scale
variations result in B-mode detections for the smaller scale statistics for the bottom row, while no such detections are found for the top row.
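
The basis construction and rank check of this section can be sketched as follows. The function names are hypothetical, and the toy vectors at the end are stand-ins chosen only so that the example runs; they are not the filters or ambiguous modes of this work.

```python
import numpy as np

def mode_basis(f_pa, f_pb, f_ma, f_mb, f_plus, f_minus):
    """Stack ambiguous, E-like and B-like directions as rows of length 2N.

    f_plus, f_minus: arrays of shape (N - 2, N) holding paired F+ and F- filters.
    """
    zero = np.zeros_like(f_pa)
    ambiguous = np.array([np.concatenate([f_pa, zero]),
                          np.concatenate([f_pb, zero]),
                          np.concatenate([zero, f_ma]),
                          np.concatenate([zero, f_mb])])
    e_like = 0.5 * np.hstack([f_plus, f_minus])     # {F+, +F-}/2
    b_like = 0.5 * np.hstack([f_plus, -f_minus])    # {F+, -F-}/2
    return ambiguous, e_like, b_like

def is_basis(ambiguous, e_like, b_like):
    """True if the stacked rows span the full 2N-dimensional space."""
    full = np.vstack([ambiguous, e_like, b_like])
    return bool(np.linalg.matrix_rank(full) == full.shape[1])

# Toy example with N = 5, using unit vectors as placeholder inputs.
eye = np.eye(5)
amb, e_like, b_like = mode_basis(eye[0], eye[1], eye[0], eye[1], eye[2:], eye[2:])
ok = is_basis(amb, e_like, b_like)
```

Note that `numpy.linalg.matrix_rank` applies its own default tolerance to the singular values, so nearly degenerate bases may deserve a closer look at the singular values themselves.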



Finally, it is important to note that the exact properties of the
matrices M± influence the properties of the basis constructed above
nontrivially. Specifically, if one uses equal area binning (i.e.,
Hi² − Li² is the same for all bins i), then one can show using the
expressions for these matrices given in Appendices B and C for
geometric binning functions that M- = M+^T. Thus for vectors along
the F± directions, the matrices M± in this case are approximately
orthogonal. A quick calculation shows that if these matrices were
exactly orthogonal and one started out with an orthonormal basis
for the F+ subspace, then the F- vectors would also be
orthonormal. In this case the E- and B-mode subspaces would then
be exactly mutually orthogonal. In practice this exact orthogonality
is not realized, but the E- and B-mode subspaces do retain some
degree of orthogonality even for logarithmic binning for the
COSEBI-like statistics described below in Section 4.1.
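
The quick calculation referred to above can be written out as follows, under the idealized assumptions that M+ is exactly orthogonal and that the F+ vectors, labelled here by hypothetical indices m and n, are orthonormal. The F- vectors then inherit orthonormality,

$$
F_-^{(m)} \cdot F_-^{(n)} = F_+^{(m)\,T} M_+^T M_+ F_+^{(n)} = F_+^{(m)} \cdot F_+^{(n)} = \delta_{mn} ,
$$

and the inner product between any E-mode direction and any B-mode direction of the form given in Equation (17) vanishes,

$$
\frac{1}{4} \{F_+^{(m)}, +F_-^{(m)}\} \cdot \{F_+^{(n)}, -F_-^{(n)}\}
= \frac{1}{4} \left( F_+^{(m)} \cdot F_+^{(n)} - F_-^{(m)} \cdot F_-^{(n)} \right)
= \frac{1}{4} \left( \delta_{mn} - \delta_{mn} \right) = 0 .
$$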



3.5 The Binned E/B-mode Estimator Covariances 

With Equation 10 it is possible to compute the covariance of these
statistics. I follow the same method outlined in Schneider et al.
(2010) and get, assuming that the EB cross-power PEB(ℓ) is zero
and the power spectra are Gaussian,



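$$
\mathrm{Cov}(E_n, E_m) = \frac{1}{\pi\Omega} \int d\ell\,\ell \left[ W_+^n(\ell)\,W_+^m(\ell) \left( P_E(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 + W_-^n(\ell)\,W_-^m(\ell) \left( P_B(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 \right] \tag{18}
$$

$$
\mathrm{Cov}(B_n, B_m) = \frac{1}{\pi\Omega} \int d\ell\,\ell \left[ W_+^n(\ell)\,W_+^m(\ell) \left( P_B(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 + W_-^n(\ell)\,W_-^m(\ell) \left( P_E(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 \right] \tag{19}
$$

$$
\mathrm{Cov}(E_n, B_m) = \frac{1}{\pi\Omega} \int d\ell\,\ell \left[ W_+^n(\ell)\,W_-^m(\ell) \left( P_E(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 + W_-^n(\ell)\,W_+^m(\ell) \left( P_B(\ell) + \frac{\sigma_\gamma^2}{\bar{n}} \right)^2 \right] \tag{20}
$$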

These expressions reduce to those in Schneider et al. (2010) when
W-(ℓ) = 0. Also note that the squared amplitude of the W-(ℓ)
window function controls the amount of excess variance in the E-
and B-mode statistics due to E/B-mode mixing. The procedure for
defining F- introduced above serves to minimize this excess
variance. Similarly, by minimizing W-(ℓ), the covariance between the
E- and B-mode statistics is minimized.



4 EXAMPLES AND TESTS OF THE ESTIMATORS 

In this section, I will illustrate the ideas discussed above
by constructing and evaluating the performance of two
different E/B-mode statistics. I first present COSEBI-like
(Schneider, Eifler, & Krause 2010, hereafter SEK10) statistics in
Section 4.1. These statistics have ∼10× data compression ratios
for upcoming surveys, which can greatly simplify data analysis.
Using these estimators I will also evaluate the level of E/B-mode
mixing as a function of the number of shear correlation function
bins. Then in Section 4.2 I present wavelet-like B-mode statistics
which allow one to determine the scale and location of B-mode
contamination in shear correlation function data. Finally, in
Section 4.3 I explore the properties of the approximate ambiguous,
E- and B-mode decomposition discussed above and the Fisher
information content of these subspaces using the COSEBI-like
statistics.



4.1 Binned COSEBI-like Estimators 

As demonstrated in SEK10, filter functions which are polynomials
in ln θ yield highly efficient data compression. Additionally,
SEK10 built a set of filter functions which are orthonormal over a
finite interval (using a slightly different definition of
orthonormality than the vector inner product used above). In order to
generate a similar set of filters for correlation functions measured
with N bins, I take as an initial shape for the F+ filters
Pℓ(t)/exp[(ln θ − ln L)/2], where Pℓ(t) is the Legendre polynomial
of order ℓ and t is simply the logarithm of the angle θ remapped to
the appropriate interval,
t = 2[ln θ − (ln H + ln L)/2]/(ln H − ln L).⁸ I increase
ℓ for each vector in the basis starting at ℓ = 2 and I use the
logarithmic mean of each bin interval to evaluate the polynomial. The
first two vectors in the basis are the constraint directions F+a and
F+b so that the third vector is started with ℓ = 2. As the number of
roots in the Legendre polynomials increases, they become harder
to represent properly in the discrete binned space. Thus when the
index of the Legendre polynomial ℓ is greater than 2N/3, where
N is the number of bins in the interval, I switch to generating the
fiducial basis by producing F+ vectors at order ℓ divided into ℓ + 1
sections alternating between −1 and +1. This construction
generates as many roots in the interval as the Legendre polynomial
of order ℓ would have, but these roots are now resolved properly.
Note that the maximum value of ℓ is N − 1, so that one gets
exactly N − 1 roots when N intervals alternate between −1 and +1.
Then the Gram-Schmidt procedure is applied to the constraint
vectors and the N − 2 fiducial F+ vectors as described above in order
to produce the final F+ filters. The factor of exp[(ln θ − ln L)/2]
has been inserted to roughly account for the difference between the
vector inner product definition and that in SEK10. In Figure 1 I
show the COSEBI-like vector basis for F+n and F-n for 50 shear
correlation function data points between 1 and 400 arcmin.

⁸ Note that one can discretize the SEK10 statistics directly using the
correspondence between the T± and F± filter functions, F±i ∼
T±(θi)θi/Wi(θi), where θi is a representative point in bin i. However,
I have found that this procedure produces statistics which do not exhibit the
data compression properties of the SEK10 statistics.
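
The construction of the fiducial shapes described in this section can be sketched as below, before the Gram-Schmidt step. This is an illustrative reading of the recipe and not the code used for the figures: the function name is hypothetical, and the way bins are assigned to the ℓ + 1 alternating sections (equal numbers of bins per section, as nearly as possible, starting with −1) is an assumption of this example.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def cosebi_like_shapes(lo, hi):
    """Fiducial F+ shapes of order ell = 2 .. N-1 on bins [lo_i, hi_i]."""
    n = lo.size
    ln_l, ln_h = np.log(lo[0]), np.log(hi[-1])
    ln_theta = 0.5 * (np.log(lo) + np.log(hi))       # logarithmic mean of each bin
    t = 2.0 * (ln_theta - 0.5 * (ln_h + ln_l)) / (ln_h - ln_l)
    damping = np.exp(0.5 * (ln_theta - ln_l))        # exp[(ln theta - ln L)/2]
    shapes = []
    for ell in range(2, n):
        if ell <= 2 * n / 3:
            profile = Legendre.basis(ell)(t)         # P_ell(t)
        else:
            # ell + 1 sections of alternating sign give exactly ell sign changes.
            section = np.floor(np.arange(n) * (ell + 1) / n).astype(int)
            profile = np.where(section % 2 == 0, -1.0, 1.0)
        shapes.append(profile / damping)
    return np.array(shapes)

# Example: 12 logarithmic bins between 1 and 400 arcmin.
edges = np.geomspace(1.0, 400.0, 13)
shapes = cosebi_like_shapes(edges[:-1], edges[1:])
```

The returned rows would then be passed, after the two constraint vectors, through the Gram-Schmidt procedure to obtain the final F+ filters.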

In Figure 2 I show the level of E/B-mode mixing for these
statistics as a function of the number of shear correlation
function bins. I have plotted the ratio of the first 12 E- and B-mode
statistics to their errors for both DES- and LSST-like surveys. Note
that in general these statistics are correlated, but to make this plot
I only use the diagonal contributions to the error covariance
matrix. The statistics in this plot were computed with PB(ℓ) = 0, so
any nonzero B-mode statistic amplitude is completely due to
E/B-mode mixing. Additionally, the E/B-mode mixing in the estimators
is negligible if the nonzero amplitude of the B-mode statistic is
well below the error bar on the statistic. This condition ensures that
any statistically significant systematic effect is detectable and is not
confused with E-modes due to E/B-mode mixing.

In general, the level of E/B-mode mixing decreases as the 
number of bins used for the shear correlation function increases. 
Additionally, it is clear that for a DES-like survey fewer shear cor- 
relation function bins can be used than for an LSST-like survey. If 
one uses a threshold of the B-mode statistic being no more than
10% of the 1-sigma error bar, then a DES-like survey will need of
order 50 bins, whereas an LSST-like survey will need ∼100 bins.
This figure also demonstrates the data compression properties of
these statistics. In the case of a DES-like survey, of order 50 shear
correlation function data points can be compressed into only ∼8
E-mode statistics which are measurably non-zero. An LSST-like
survey has increased statistical power, compressing of order 100 data
points into only ∼10 statistically significant E-mode statistics. Due
to correlations between these remaining E-mode statistics, further 
data compression may be possible, but I will not explore this issue 
further in this work. 



4.2 Wavelet-like B-mode Size-location Estimators 

In this section, I give an example of the use of spatially-local basis
functions to build sets of B-mode estimators which can pinpoint
the size and location of B-mode systematics in shear correlation
function data. Consider the following function, known as the Ricker
wavelet (Ricker 1953),

$$
\psi(t, \sigma) = \frac{2}{\sqrt{3\sigma}\,\pi^{1/4}} \left( 1 - \frac{t^2}{\sigma^2} \right) \exp\left( -\frac{t^2}{2\sigma^2} \right) . \tag{21}
$$



I take as a set of starting functions the following functions, indexed
by {i, N},

$$
\phi_{iN}(t) = \psi\!\left( t - i\Delta_N - \Delta_N/2 - \ln L,\ \Delta_N/8 \right) \times \exp\left( -\frac{t - \ln L}{2} \right) , \tag{22}
$$

where ΔN = (ln H − ln L)/N and i ∈ {0, 1, ..., N − 1}. This
definition shifts and scales the location and size of the base wavelet
so that an integer number N of them fit in the interval [ln L, ln H]
and they have most of their support in each width-ΔN subinterval
of the base interval Δ1 = ln H − ln L. These smooth starting
functions are turned into E- and B-mode estimators by discretizing
the functions over the interval [ln L, ln H] and
then projecting out the F+a and F+b modes as described above.
The filters which result from this process will be denoted as F+Ni
and F-Ni. Figure 3 shows an example set of these filters for 100
correlation function data points between 1 and 400 arcmin.
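
As a sketch of how such starting functions could be generated, the code below evaluates shifted and scaled Ricker wavelets at the logarithmic bin centres. The function names are hypothetical. The exponential damping factor, in particular its sign and its scale of 1/2, is an assumption made by analogy with the COSEBI-like construction, and the projection of the F+a and F+b modes is left out.

```python
import numpy as np

def ricker(t, sigma):
    """Ricker wavelet of width sigma, normalized to unit L2 norm."""
    amp = 2.0 / (np.sqrt(3.0 * sigma) * np.pi ** 0.25)
    x = t / sigma
    return amp * (1.0 - x**2) * np.exp(-0.5 * x**2)

def wavelet_shapes(ln_theta, ln_l, ln_h, n_wavelets):
    """Starting functions for one level N, sampled at the bin locations ln_theta."""
    delta = (ln_h - ln_l) / n_wavelets
    rows = []
    for i in range(n_wavelets):
        centre = ln_l + (i + 0.5) * delta            # middle of subinterval i
        damping = np.exp(-0.5 * (ln_theta - ln_l))   # ASSUMED form of the damping
        rows.append(ricker(ln_theta - centre, delta / 8.0) * damping)
    return np.array(rows)

# Example: 100 logarithmic bins between 1 and 400 arcmin, level N = 4.
edges = np.geomspace(1.0, 400.0, 101)
ln_theta = 0.5 * (np.log(edges[:-1]) + np.log(edges[1:]))
shapes = wavelet_shapes(ln_theta, np.log(1.0), np.log(400.0), 4)
```

Repeating the call for several values of `n_wavelets` gives the multi-level family of filters indexed by {N, i}.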

In order to gain intuition into how these B-mode estimators
work, I show a simple but somewhat contrived example (however,
see e.g. Fu et al. 2008; Eifler et al. 2010) of two different kinds of
B-mode systematics in Figure 4. The top row displays a smooth
B-mode systematic in the correlation function, while the bottom
row shows a B-mode systematic with much more fine scale
variation. These B-mode systematics were generated by setting PB(ℓ)
to be non-zero using simple Gaussian kernels, the form of which is
uninteresting, and then computing the shear correlation functions
using Equations (3) and (4). For each value of N on the vertical
axis, the contour plots show the amplitude of the B-mode statistics
for each ΔN-sized interval across the horizontal axis. In this type
of analysis, a B-mode signal with more fine scale variation has a
different signature than the smoother B-mode signal, with more
B-mode signal detected in higher N and thus smaller sized filters. For
data with shape noise, a similar contour plot can be made, except
that the deviation from zero measured in units of the standard
deviation should be used to produce the color scaling. The overall level
of B-mode contamination can then be assessed via a χ² statistic
computed over all of the different estimators indexed by {N, i},
accounting for the correlations between the different estimators.
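
A minimal sketch of these two summaries is given below, assuming the B-mode statistics and their covariance matrix are already available as NumPy arrays. The function names and the toy numbers are purely illustrative.

```python
import numpy as np

def b_mode_chi2(b_stats, cov):
    """chi^2 = b^T C^{-1} b, including correlations between estimators."""
    b_stats = np.asarray(b_stats, dtype=float)
    # Solve C x = b instead of forming the explicit inverse of C.
    return float(b_stats @ np.linalg.solve(cov, b_stats))

def significance(b_stats, cov):
    """Deviation of each statistic from zero in units of its standard deviation."""
    return np.asarray(b_stats, dtype=float) / np.sqrt(np.diag(cov))

# Toy example with two correlated estimators.
b_stats = np.array([1.0, 2.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])
chi2 = b_mode_chi2(b_stats, cov)
sig = significance(b_stats, cov)
```

The per-estimator significances are what would set the color scale of a noisy version of the contour plot, while the χ² value summarizes the overall level of contamination.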

Here I have chosen to lay out the F±Ni filters in a simple
geometric pattern. However, when a B-mode signal crosses between
two filters, the significance drops. Thus it might be more
advantageous to consider sets of filters which shift and scale in size more
continuously. I leave this generalization to future work. Note
however that the filters I have presented here are offset between
different levels N so that in practice, no B-mode signal ever goes
completely undetected. This general point can be clearly illustrated
by examining the Fourier-space filters W+(ℓ) introduced above for
the wavelet-like F±Ni statistics. These filters for N ∈ {2, 3, 4} are
shown in Figure 5. The peaks and troughs for each N correspond
to a single location of the filter in real space, with N peaks and
troughs present for the N-indexed filters. As the filters shift in real
space over the shear correlation function range, they cover
different, but overlapping ranges in Fourier space. Thus by combining
filters of different size and location, any B-mode signals in Fourier
space will produce a nonzero amplitude in these filters.

4.3 Fisher Information Content and Full Mode Decomposition with COSEBI-like Statistics

In this section I present the full ambiguous, E- and B-mode
decompositions discussed in Section 3.4 using the COSEBI-like statistics
defined above. I also compute the Fisher information content of the
approximate ambiguous, E- and B-mode subspaces for the
parameters σ8 and Ωm. I use a basis composed of 50 correlation function
bins logarithmically spaced from 1 to 400 arcmin. I find that the
full basis composed of these different spaces completely spans the
space of 2N data points, demonstrating that at least one such
approximate decomposition exists.

Figure 5. The Fourier space filters W+(ℓ) for the N ∈ {2, 3, 4} wavelet-
like statistics computed for θ ∈ [1, 400] arcmin with 100 shear correla-
tion function data points. The different colors show each of the filters. The
W-(ℓ) filters are approximately three orders of magnitude smaller than the
W+(ℓ) filters. The different filters in size (the overall width of the peaks
and troughs) and location (the mean location in ℓ for each set of peaks and
troughs) approximately cover all of ℓ-space accessible given the angular
range of the data. The horizontal axis is the multipole ℓ.

Because this decomposition is only approximate
and the matrices M± are not exactly orthogonal for the vectors
F±, the different subspaces will in general not be perfectly
orthogonal to one another. The degree of orthogonality of the ambiguous,
E- and B-mode subspaces can be characterized by the maximum
absolute normalized projection of a basis vector of any one of the
spaces onto the basis vectors of the other spaces,

$$
|\cos\theta| = \frac{|a \cdot b|}{|a|\,|b|} , \tag{23}
$$



where a and b are vectors of length 2N with the normal inner
product definition. This quantity is just the cosine of the minimum angle
between the vectors of any of the subspaces with the other. For
subspaces which nearly share a basis vector, this maximum absolute
normalized projection will be ≈ 1, but it cannot reach
one, or else two of the vectors in the basis would be linearly
dependent. For spaces which are mutually orthogonal, this maximum
absolute normalized projection will be zero.
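
The maximum absolute normalized projection of Equation (23) can be evaluated for two sets of basis vectors with a few array operations. The sketch below assumes each subspace is stored as a matrix whose rows are its basis vectors; the function name is hypothetical.

```python
import numpy as np

def max_abs_projection(space_a, space_b):
    """Largest |cos(angle)| between any row of space_a and any row of space_b."""
    a = space_a / np.linalg.norm(space_a, axis=1, keepdims=True)
    b = space_b / np.linalg.norm(space_b, axis=1, keepdims=True)
    return float(np.max(np.abs(a @ b.T)))

# Toy example in three dimensions.
space_a = np.array([[1.0, 0.0, 0.0]])
space_b = np.array([[1.0, 1.0, 0.0],
                    [0.0, 0.0, 2.0]])
proj = max_abs_projection(space_a, space_b)
```

Applied to the E-mode, B-mode, and ambiguous basis vectors pairwise, this returns the three numbers used to characterize the decomposition.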

For the set of basis vectors just computed, I find that the
maximum projection between the E- and B-mode subspaces is
1.39 × 10⁻¹. Similarly, the maximum projection between the
E-mode and ambiguous subspaces is 1.51 × 10⁻³ and for the B-mode
and ambiguous subspaces is 5.52 × 10⁻³. Consistent with the
argument made above, with equal area binning the E- and B-mode
subspaces are more orthogonal, with the maximum absolute
normalized projection between them being 2.49 × 10⁻². However in
this case, the ambiguous modes are less orthogonal to the E- and
B-mode subspaces, with the maximum absolute normalized
projections being 1.31 × 10⁻¹ and 3.19 × 10⁻¹ respectively. The
particularly poor orthogonality of the B-mode subspace to the
ambiguous mode space in this case is reflected in the SVD of the matrix
M+M- − I computed with equal area binning.

As mentioned above, the cost of E- and B-mode separation is
that one must potentially throw away information in the ambiguous
modes (e.g., Smith 2006; Schneider et al. 2010). Using the
approximate mode decomposition described in this section, this point can
be illustrated clearly and approached quantitatively. I compute the
Fisher information content of the ambiguous, E- and B-mode
subspaces and combinations of them assuming Gaussian statistics and
using the expressions presented above for the covariances of the
two-point correlation functions and Fisher information matrices.
For the E- and B-mode statistics, I simply transform the
covariance matrix of the two-point correlation functions, since the
statistics are linear combinations of the two-point data. One can obtain
similar results computing their covariance matrices directly from
their Fourier space window functions and the expressions presented
in Section 3.5. The covariance matrix of the ambiguous modes is
computed by direct transformation as well.

The Fisher information content I = √|F| of these subspaces
for the parameters σ8 and Ωm using the binning scheme and
angular range for the basis constructed above and also assuming
DES-like errors is 2.18 × 10⁴, 3.49 × 10⁴, and 4.49 × 10⁻¹ for the
ambiguous, E- and B-mode subspaces respectively. By
comparison, the full Fisher information for the entire two-point function
data set, ξ±i for i ∈ [1, N], is 5.98 × 10⁴. Note that these
figures are not additive and so can only be qualitatively compared.
However the basic trends are clear. The B-mode subspace has very
little information on the parameters as expected. Any residual
information is either a numerical artifact or is due to E/B-mode mixing.
The E-mode subspace has most of the information, but the
ambiguous modes carry a non-trivial fraction of the total information. By
retaining the ambiguous modes in the analysis with the E-modes,
most of the information can be restored in comparison with the full
two-point function, with the E- plus ambiguous mode Fisher
information being 5.75 × 10⁴. I have tested several other fiducial choices
besides the COSEBI-like statistics for constructing the full mode
decomposition of the correlation function data and found similar
conclusions regarding the relative Fisher information content of the
various subspaces. Similar conclusions hold for LSST-like surveys
as well. While these Fisher information calculations are not
particularly realistic as survey projections, they illustrate nicely the cost
of E/B-mode separation in cosmic shear.
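
The subspace Fisher calculation described here can be sketched as follows. The sketch assumes a Gaussian likelihood with a parameter-independent covariance, so that the Fisher matrix is the standard expression built from the derivatives of the mean data vector; the function names are hypothetical, and the random inputs only stand in for a real covariance matrix and real parameter derivatives.

```python
import numpy as np

def subspace_fisher(basis, cov_xi, dxi_dp):
    """Fisher matrix of the linear statistics s = basis @ xi.

    basis:  (K, 2N) rows spanning the subspace of interest
    cov_xi: (2N, 2N) covariance of the stacked (xi+, xi-) data vector
    dxi_dp: (2N, P) derivatives of the data vector w.r.t. P parameters
    """
    cov_s = basis @ cov_xi @ basis.T       # covariance of the transformed data
    ds_dp = basis @ dxi_dp                 # derivatives of the transformed data
    return ds_dp.T @ np.linalg.solve(cov_s, ds_dp)

def information(fisher):
    """Scalar summary I = sqrt(det F)."""
    return float(np.sqrt(np.linalg.det(fisher)))

# Toy example: 6 data points, 2 parameters, placeholder covariance.
rng = np.random.default_rng(1)
a = rng.normal(size=(6, 6))
cov_xi = a @ a.T + 6.0 * np.eye(6)
dxi_dp = rng.normal(size=(6, 2))
full = subspace_fisher(np.eye(6), cov_xi, dxi_dp)
part = subspace_fisher(np.eye(6)[:3], cov_xi, dxi_dp)
```

Because the statistics are linear in the data, any invertible re-mixing of a complete basis leaves the Fisher matrix unchanged, while restricting to a subspace can only reduce the information.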

Finally, these Fisher information calculations can also
illustrate the data compression properties of the COSEBI-like statistics.
In particular, for the DES-like survey, the Fisher information in the
first eight E-mode statistics is 3.46 × 10⁴ as opposed to 3.49 × 10⁴
for all of the E-mode statistics. Similarly, for the LSST-like survey
using 100 correlation function bins in the same angular range as the
DES-like survey, the first ten E-mode statistics have an information
content of 6.489 × 10⁵ as opposed to 6.496 × 10⁵ for all of the
E-mode statistics. These results are in qualitative agreement with
those of Schneider et al. (2010).



5 CONCLUSIONS 

In this work, I have demonstrated that the use of two-point E/B- 
mode estimators with binned shear correlation function data will 
generally result in non-trivial E/B-mode mixing. I computed the 
amount of this mixing and provided new E/B-mode estimators 
which minimize the unwanted mode mixing. I also gave practical 
recipes for building and using these estimators with binned cos- 
mic shear data. Using these optimal estimators, I demonstrated that 
approximate decompositions of the vector space of binned correla- 
tion function points into ambiguous, E- and B-modes do exist. Us- 
ing one of these decompositions, I found that the ambiguous mode 
subspace contains a non-trivial amount of information on typical 
cosmological parameters. I also gave two example applications of
these new estimators to generic problems in cosmic shear data
analysis: data compression (Section 4.1) and B-mode localization with
wavelets (Section 4.2).

The estimators presented here have several nice features 
adapted to practical cosmic shear data analysis. 

• They are linear combinations of the binned shear correlation 
function data, defined in Equation 10, and thus treat the binning 
explicitly. This property makes their computation and also error 
propagation/analysis with them trivial once the shear correlation 
function and its errors are known. 

• The level of E/B-mode mixing due to the binning can be com- 
puted exactly, up to the knowledge of the binning window func- 
tions, for these estimators using Equations 10 and I0. In the limit 
of small bins, which is needed to suppress the E/B-mode mixing, 
the binning window functions are expected to be close to the geo- 
metric approximation used throughout this work. 

• They give quantitative criteria in terms of the E/B-mode
mixing by which to decide the number of shear correlation function
bins to use, demonstrated in Figure 2 for the COSEBI-like
estimators.

• The design of new E/B-mode estimators with specific
purposes using the formalism presented in this work reduces to simple
linear algebraic manipulations, presented in Section 3.2.

The optimal statistics presented in this work are well-suited to 
blinded or closed-box, high-precision cosmic shear data analyses. 
For example, before any shear correlation functions are computed
from the data, one can estimate, given the expected statistical
accuracy of the data, the exact amount of E/B-mode mixing for a given
binning scheme and set of estimators. One can then choose a
fiducial binning scheme and estimator choice that properly minimizes
the E/B-mode mixing and retains all of the cosmological informa- 
tion. Then these choices can be fixed throughout the data analysis 
process in order to avoid any observer biases in detecting or assess- 
ing B-mode contamination arising from how exactly the E/B-mode 
separation was done. Additionally, in a blinded or closed-box
analysis one might not look at plots of the shear correlation function
data until the entire analysis is complete. Unfortunately in this case, 
one might miss crucial information about potential systematics in 
the shape of the shear correlation function data. The wavelet-like 
B-mode estimators presented in this work can be used as a sub- 
stitute for and quantitative measure of the information gained by 
looking at the shape of the shear correlation function data.
Additionally, they can be applied in an automated way to large data sets
in order to pinpoint areas of potential systematic contamination for 
further investigation. Importantly, because these estimators have a 
large degree of spatial locality, they can potentially provide crucial 
information on where any B-mode contamination is coming from 
and not just indicate its existence. 

The ability to easily and quickly design E/B-mode estima- 
tors tailored to a specific purpose can potentially be very useful 
in practice as well. For instance, the problem of computing an E/B- 
mode statistic which maximizes the signa l-to-noise or cosmologi - 
cal information content, as considered bv lFu & Kilbinged ( l2010h . 
could now be reformulated with the linear estimators presented 
here. Additionally, one could build estimators which are along the 
normal modes of the correlation function data computed from the 
correlation function covariance matrix. Also, one could attempt 
to build direct estimators for the E/B-mode correlation functions 
(Crittenden et al. 2002; Schneider et al. 2002b, 2010) over a finite 
interval. Alternatively, one could attempt to build filters which are 
localized in Fourier space in order to exclude certain wave modes, 
for example to mitigate the effects of uncertainty in the matter 
power spectrum at small scales (see, e.g., Huterer & White 2005), or 
to be sensitive to B-modes from only a range of $\ell$. As simple linear 
combinations of cosmic shear data points, the estimators presented 
in this work are well-suited to these applications and to practical 
cosmic shear data analysis in general. 

APPENDIX A: PAIR-WISE SHEAR CORRELATION 
FUNCTION ESTIMATORS AND FINITE BIN WIDTHS 

I use the formalism presented in Appendix A of Schmidt et al. 
(2009) to derive the leading order bin weighting terms in the 
pair-wise estimator for the shear correlation functions. This 
estimator is (e.g., Schneider et al. 2002a) 

$$\hat{\xi}_{\pm k} = \frac{\sum_{ij} w_i w_j W_{\theta_k}(x_i, x_j)\,(e_{it} e_{jt} \pm e_{i\times} e_{j\times})}{\sum_{ij} w_i w_j W_{\theta_k}(x_i, x_j)} \tag{A1}$$

where $W_{\theta_k}(x_1, x_2)$ is the binning function, which is unity inside 
the bin and zero outside, with the bin centered at $\theta_k$ with some bin 
width $\Delta\theta$, $e_{it,i\times}$ is the component of the galaxy shape parallel or 
crossed with respect to the great circle connecting the two galaxies, 
and the $w_{i,j}$ are weights applied to each galaxy. Below I include the 
weights $w_{i,j}$ in the survey window function. Starting with Equation 
A15 of Schmidt et al. (2009) and assuming that the source galaxy 
density field is uncorrelated with the shear field and also neglecting 
lensing bias/reduced shear effects, the expectation value of this 
estimator can be written as 

$$\langle \hat{\xi}_{ab} \rangle = \frac{1}{\Omega_b} \int d^2x_1 \int d^2x_2\, W_\theta(x_1, x_2)\, S(x_1) S(x_2)\, \xi_{ab}(|x_1 - x_2|) \left[ 1 + \xi_{gg}(|x_1 - x_2|) - \frac{1}{\Omega_b} \int d^2x_3 \int d^2x_4\, W_\theta(x_3, x_4)\, S(x_3) S(x_4)\, \xi_{gg}(|x_3 - x_4|) \right] \tag{A2}$$

where $\xi_{ab}$ is the shear correlation function for $a, b \in \{t, \times\}$, 
$\Omega_b = \int d^2x_1 \int d^2x_2\, W_\theta(x_1, x_2)\, S(x_1) S(x_2)$ is the survey 
averaged bin area, and $\xi_{gg}$ is the source galaxy angular correlation 
function. $S(x_i)$ is the survey window function including the 
weights $w_{i,j}$. The second term in the brackets arises directly from 
the weighting over the bin by the sampling density of the source 
galaxies and the third term in the brackets is the first non-trivial 
term in the power-series expansion of the denominator of the 
estimator (i.e., the total number of observed galaxies in the bin). The 
last two terms in this integral do not exactly cancel as stated in 
Schmidt et al. (2009) because $\xi_{gg}$ is not exactly equal to its average 
over the bin for all $\theta$ in the bin. In addition to this source clustering 
weighting term, the survey window function contributes an additional 
weight over the bin. The form of this weighting function is 
highly non-trivial, but for small enough bins, this weighting should 
be negligible. However, quantitative results describing how small 
the bins need to be in order to suppress this weighting require 
detailed survey simulations, which are beyond the scope of this work. 
Thus for simplicity I neglect the survey window function weights. 
A simple estimate of the magnitude of the effect of source galaxy 
clustering using typical galaxy angular correlation functions (e.g., 
Connolly et al. 2002) shows that this effect (i.e., the difference of 
the last two terms in the bracket above) is $< 0.005$ when using at 
least five bins per decade. To get this number I assumed the 
galaxy-galaxy angular correlation function was a power law over the 
angular range of [1, 400] arcmin with $\xi_{gg} = 0.0045 \times (\theta / 1\,\mathrm{deg})^{-0.7}$, 
consistent with the results of Connolly et al. (2002). 



APPENDIX B: OPTIMAL E/B-MODE ESTIMATORS 
WITH GEOMETRIC BIN WEIGHTING FUNCTIONS 

In this Appendix, I give the exact form of the constraint directions, 
$F_{+a}$ and $F_{+b}$, and the matrix to compute the $F_{-i}$ from the $F_{+i}$ 
under the assumption of geometric bin weightings. Consider $N$ bins 
in angle $\theta$ from $L$ to $H$ and let $[L_i, H_i]$ be the angular range of the 
$i$th bin. Also, assume the bins are non-overlapping. Then using the 
geometric bin weighting function, $W_i(\theta) = 2\theta/(H_i^2 - L_i^2)$, I get 
for the constraint direction vectors (see Section 3.2) 

$$F_{+a} = (1, 1, 1, \ldots, 1) \tag{B1}$$

$$F_{+b} = \left( \frac{H_1^4 - L_1^4}{2(H_1^2 - L_1^2)},\ \frac{H_2^4 - L_2^4}{2(H_2^2 - L_2^2)},\ \ldots,\ \frac{H_N^4 - L_N^4}{2(H_N^2 - L_N^2)} \right) \tag{B2}$$
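As an illustration of Equations (B1) and (B2), the following is a minimal NumPy sketch that evaluates the two constraint directions for a set of contiguous bins. The function names, the logarithmic binning, and the trial filter are hypothetical choices made only for this example. The projection step assumes, as the name "constraint direction" suggests, that a valid set of $F_{+i}$ must have zero dot product with both $F_{+a}$ and $F_{+b}$.

```python
import numpy as np


def plus_constraint_directions(edges):
    """Constraint directions F_+a and F_+b for contiguous geometric bins.

    Assumes W_i(theta) = 2 theta / (H_i^2 - L_i^2) on [L_i, H_i], so that
    F_+a,i = int W_i dtheta = 1 and
    F_+b,i = int W_i theta^2 dtheta = (H_i^4 - L_i^4) / (2 (H_i^2 - L_i^2)).
    """
    edges = np.asarray(edges, dtype=float)
    low, high = edges[:-1], edges[1:]
    f_a = np.ones_like(low)
    f_b = (high**4 - low**4) / (2.0 * (high**2 - low**2))
    return f_a, f_b


def remove_constraint_components(f_plus, f_a, f_b):
    """Least-squares removal of the parts of f_plus along F_+a and F_+b."""
    basis = np.stack([f_a, f_b], axis=1)  # shape (N, 2)
    coeffs, *_ = np.linalg.lstsq(basis, f_plus, rcond=None)
    return f_plus - basis @ coeffs


# Hypothetical binning: five logarithmic bins per decade, 1 to 100 arcmin.
edges = np.logspace(0.0, 2.0, 11)
f_a, f_b = plus_constraint_directions(edges)

# Hypothetical trial filter; after projection it is orthogonal to both
# constraint directions.
trial = np.sin(np.arange(f_a.size))
f_plus = remove_constraint_components(trial, f_a, f_b)
```

Note that each entry of $F_{+b}$ simplifies to $(H_i^2 + L_i^2)/2$, which is a convenient cross-check on any implementation.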



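The matrix to compute the $F_{-k}$ in terms of the $F_{+i}$ from 
Equation (12) is 

$$(M_+)_{ki} = \delta_{ki} + \frac{2}{H_i^2 - L_i^2} \times \begin{cases} 2(H_i^2 - L_i^2) \ln(H_k/L_k) + \frac{3}{2}(H_i^4 - L_i^4)\left(\frac{1}{H_k^2} - \frac{1}{L_k^2}\right) & \text{if } i < k \\ -\frac{1}{2}(H_k^2 - L_k^2) - 2L_k^2 \ln(H_k/L_k) + \frac{3}{2}\frac{L_k^2}{H_k^2}(H_k^2 - L_k^2) & \text{if } i = k \\ 0 & \text{if } i > k \end{cases} \tag{B3}$$

where $i$ and $k$ run over 1 to $N$. Then the $F_{-k}$ are computed as 

$$F_{-k} = \sum_{i=1}^{N} (M_+)_{ki} F_{+i}. \tag{B4}$$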
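The matrix relation above can be sketched numerically as follows. This is only an illustration under stated assumptions: the entries are the closed forms obtained by integrating the kernel $(4/\phi^2 - 12\theta^2/\phi^4)\,H(\phi - \theta)$ against the geometric weights $W_i(\theta) W_k(\phi)$ and dividing by $\int d\theta\, W_k^2(\theta)/\theta = 2/(H_k^2 - L_k^2)$. The function name and the binning are hypothetical. The final lines check one property that follows from these closed forms: a set of $F_{+i}$ that is confined to the first few bins and has zero dot product with both constraint directions maps to $F_{-k}$ that vanish in all higher bins.

```python
import numpy as np


def m_plus_geometric(edges):
    """Matrix M_+ such that F_- = M_+ @ F_+ for geometric bin weights."""
    edges = np.asarray(edges, dtype=float)
    low, high = edges[:-1], edges[1:]
    n = low.size
    mat = np.eye(n)
    for k in range(n):
        lk, hk = low[k], high[k]
        # Diagonal (i = k) term.
        mat[k, k] += (-hk**2 + lk**2 * (4.0 - 3.0 * lk**2 / hk**2
                                         - 4.0 * np.log(hk / lk))) / (hk**2 - lk**2)
        # Bins at smaller angles (i < k) contribute; bins with i > k do not.
        for i in range(k):
            mat[k, i] += (4.0 * np.log(hk / lk)
                          + 3.0 * (high[i]**2 + low[i]**2)
                          * (1.0 / hk**2 - 1.0 / lk**2))
    return mat


# Hypothetical binning: five logarithmic bins per decade, 1 to 100 arcmin.
edges = np.logspace(0.0, 2.0, 11)
m_plus = m_plus_geometric(edges)

# F_+ supported on the first three bins and orthogonal to F_+a and F_+b.
f_b = 0.5 * (edges[1:]**2 + edges[:-1]**2)
f_plus = np.zeros(edges.size - 1)
f_plus[:3] = np.cross(np.ones(3), f_b[:3])

f_minus = m_plus @ f_plus  # vanishes (to rounding) outside the first three bins
```

Because the matrix is lower triangular, each $F_{-k}$ depends only on the $F_{+i}$ in bins at equal or smaller angles.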



APPENDIX C: OPTIMAL E/B-MODE ESTIMATORS 
WITH F- FIXED 

One can easily define optimal E/B-mode estimators with $F_-$ fixed 
instead of $F_+$ fixed. In this case one fixes the $F_+$ by minimizing 

$$0 = \frac{\partial}{\partial F_{+k}} \int d\ell\, \ell\, |W_-(\ell)|^2 . \tag{C1}$$

The solution to this equation is (cf. Equation 12) 



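$$F_{+k} = F_{-k} + \left[ \int d\theta\, \frac{W_k^2(\theta)}{\theta} \right]^{-1} \sum_i F_{-i} \int d\theta\, d\phi\, W_i(\theta) W_k(\phi) \left[ \frac{4}{\theta^2} - \frac{12\phi^2}{\theta^4} \right] H(\theta - \phi) . \tag{C2}$$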



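In this case the constraint directions can be derived by arguments 
similar to those in Section 3.2 and are 

$$F_{-a} = \left( \int_{L_1}^{H_1} d\theta\, \frac{W_1(\theta)}{\theta^2},\ \int_{L_2}^{H_2} d\theta\, \frac{W_2(\theta)}{\theta^2},\ \ldots,\ \int_{L_N}^{H_N} d\theta\, \frac{W_N(\theta)}{\theta^2} \right) \tag{C3}$$

$$F_{-b} = \left( \int_{L_1}^{H_1} d\theta\, \frac{W_1(\theta)}{\theta^4},\ \int_{L_2}^{H_2} d\theta\, \frac{W_2(\theta)}{\theta^4},\ \ldots,\ \int_{L_N}^{H_N} d\theta\, \frac{W_N(\theta)}{\theta^4} \right) \tag{C4}$$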

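and Equation (C2) again defines a matrix which is used to compute 
the $F_+$ in terms of the $F_-$. Finally, for the geometric bin weighting 
functions the constraint directions and matrix relating the $F_+$ to $F_-$ 
are 

$$F_{-a} = \left( \frac{2\log[H_1/L_1]}{H_1^2 - L_1^2},\ \frac{2\log[H_2/L_2]}{H_2^2 - L_2^2},\ \ldots,\ \frac{2\log[H_N/L_N]}{H_N^2 - L_N^2} \right) \tag{C5}$$

$$F_{-b} = \left( \frac{1}{H_1^2 - L_1^2}\left[\frac{1}{L_1^2} - \frac{1}{H_1^2}\right],\ \frac{1}{H_2^2 - L_2^2}\left[\frac{1}{L_2^2} - \frac{1}{H_2^2}\right],\ \ldots,\ \frac{1}{H_N^2 - L_N^2}\left[\frac{1}{L_N^2} - \frac{1}{H_N^2}\right] \right) \tag{C6}$$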



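and 

$$(M_-)_{ki} = \delta_{ki} + \frac{2}{H_i^2 - L_i^2} \times \begin{cases} 0 & \text{if } i < k \\ \frac{1}{2}\left[ -H_k^2 + L_k^2 \left( 4 - 3L_k^2/H_k^2 - 4\ln(H_k/L_k) \right) \right] & \text{if } i = k \\ 2(H_k^2 - L_k^2) \ln(H_i/L_i) + \frac{3}{2}(H_k^4 - L_k^4)\left(\frac{1}{H_i^2} - \frac{1}{L_i^2}\right) & \text{if } i > k \end{cases} \tag{C7}$$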



where $i$ and $k$ run over 1 to $N$. Then the $F_{+k}$ are computed as 

$$F_{+k} = \sum_{i=1}^{N} (M_-)_{ki} F_{-i}. \tag{C8}$$
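The $F_-$-fixed case can be sketched in the same way. The code below is a hedged illustration, not a reference implementation: it evaluates the geometric-weight constraint directions of Equations (C5) and (C6) and builds the matrix from closed forms obtained by integrating the kernel in Equation (C2) against the geometric weights. All function names and the binning are hypothetical. The check at the end uses a property implied by these closed forms: a set of $F_{-i}$ confined to the last few bins and orthogonal to both $F_{-a}$ and $F_{-b}$ yields $F_{+k}$ that vanish in all lower bins.

```python
import numpy as np


def minus_constraint_directions(edges):
    """F_-a,i = int W_i / theta^2 and F_-b,i = int W_i / theta^4 (geometric W_i)."""
    edges = np.asarray(edges, dtype=float)
    low, high = edges[:-1], edges[1:]
    width2 = high**2 - low**2
    f_a = 2.0 * np.log(high / low) / width2
    f_b = (1.0 / low**2 - 1.0 / high**2) / width2
    return f_a, f_b


def m_minus_geometric(edges):
    """Matrix M_- such that F_+ = M_- @ F_- for geometric bin weights."""
    edges = np.asarray(edges, dtype=float)
    low, high = edges[:-1], edges[1:]
    n = low.size
    mat = np.eye(n)
    for k in range(n):
        lk, hk = low[k], high[k]
        # Diagonal (i = k) term; identical to the diagonal of M_+.
        mat[k, k] += (-hk**2 + lk**2 * (4.0 - 3.0 * lk**2 / hk**2
                                         - 4.0 * np.log(hk / lk))) / (hk**2 - lk**2)
        # Bins at larger angles (i > k) contribute; bins with i < k do not.
        for i in range(k + 1, n):
            width2_i = high[i]**2 - low[i]**2
            mat[k, i] += (4.0 * (hk**2 - lk**2) * np.log(high[i] / low[i]) / width2_i
                          - 3.0 * (hk**4 - lk**4) / (low[i]**2 * high[i]**2))
    return mat


# Hypothetical binning: five logarithmic bins per decade, 1 to 100 arcmin.
edges = np.logspace(0.0, 2.0, 11)
f_a, f_b = minus_constraint_directions(edges)
m_minus = m_minus_geometric(edges)

# F_- supported on the last three bins and orthogonal to F_-a and F_-b.
f_minus = np.zeros(edges.size - 1)
null_vec = np.cross(f_a[-3:], f_b[-3:])
f_minus[-3:] = null_vec / np.linalg.norm(null_vec)

f_plus = m_minus @ f_minus  # vanishes (to rounding) below the last three bins
```

Here the matrix is upper triangular, mirroring the lower-triangular structure of the $F_+$-fixed case.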